Installing packages

If you don’t already have these packages installed, run this code:

install.packages("plotly")
install.packages("ggplot2")
install.packages("dplyr")
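If you are not sure which of these packages you already have, a small check like the one below can save some time. This is only a sketch: `missing_packages` is a hypothetical helper name made up for this activity, and the `repos` URL assumes you are happy to use the CRAN cloud mirror.

```r
# Return the subset of `pkgs` that is not installed yet.
# `missing_packages` is a hypothetical helper, not part of any package.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("plotly", "ggplot2", "dplyr"))

# Install only what is missing (assumes the CRAN cloud mirror is acceptable).
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

If everything is already installed, `to_install` is empty and nothing is downloaded.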

Run this code to load the packages required for this activity:

library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Introduction

Data visualization is a very important tool in the field of data science. It helps people share the findings of their research and analysis with audiences from many different backgrounds through effective visuals. At Macalester, the most common data visualization package we are taught in R is ggplot2 (often just called ggplot). This package takes a data frame and turns it into a graphic, with aesthetic parameters that the user specifies to create the desired visualization.

For this activity, we will build on our knowledge of data visualization and learn a new tool called plotly, a library for “Interactive web-based data visualization” that can be used in R and Python. This package lets us take in data and turn it into an interactive visualization that can enhance the message of the analysis. By completing this activity and its reflection points, we hope you will learn a new skill to add to your bag of tricks.

Before beginning this activity, please look through this reference, which covers the basics and syntax of plotting with plotly: https://plotly-r.com/. Throughout this activity, if anything is unclear, please look back at it for code help.

Section 1: Basics

For this activity, we will be using the mtcars data set that is built into R.

glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
## $ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
## $ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
## $ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
## $ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
## $ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…

As mentioned above, we have commonly been taught to use ggplot to make visualizations. It is an effective package that gives users a vast number of options. Included below is a scatter plot made with ggplot and the mtcars dataset.

ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)", y = "Miles per gallon",
    title = "Fuel efficiency by weight", color = "Cylinders") +
  theme_minimal()

Reflection

Answer here:

Section 1, part b

Just like ggplot, Plotly allows us to create visualizations, but with slightly different formatting. Below are several examples of common graph types—this time using Plotly.

As you run each cell, interact with the plots. Try hovering over points, clicking and dragging to zoom, and double-clicking to reset the view.

Scatter plot

plot_ly(
  data = mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers") %>%
  layout(title = "Scatter Plot: Weight vs. MPG", xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"))

Histogram

plot_ly(
  data = mtcars, x = ~factor(cyl), type = "histogram") %>%
  layout( title = "Histogram of Cylinder Count", xaxis = list(title = "Number of Cylinders"), yaxis = list(title = "Count of Cars"))
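One way to sanity-check the bars in this plot is to count the cars per cylinder value yourself. The sketch below uses `dplyr::count()`; the name `cyl_counts` is just an illustrative choice.

```r
library(dplyr)

# Count cars per cylinder value; the bar heights in the plot should match `n`.
cyl_counts <- mtcars %>% count(cyl)
cyl_counts
```

For mtcars this gives 11 four-cylinder, 7 six-cylinder, and 14 eight-cylinder cars, which you can compare against the values shown when you hover over each bar.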

Line chart

mtcars_sorted <- mtcars %>% arrange(hp)

plot_ly(
  data = mtcars_sorted, x = ~hp, y = ~mpg, type = "scatter", mode = "lines") %>%
  layout(title = "Line Plot: MPG Across Increasing Horsepower", xaxis = list(title = "Horsepower"), yaxis = list(title = "Miles per Gallon"))
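The `arrange(hp)` step matters because a scatter trace with `mode = "lines"` connects points in row order. As a possible alternative, plotly's `add_lines()` is documented as sorting the x-values before drawing, so a sketch without the explicit sort could look like this (the object name `p_lines` is only for illustration):

```r
library(plotly)

# add_lines() sorts by x before connecting points,
# so the unsorted mtcars data frame can be passed directly.
p_lines <- plot_ly(data = mtcars, x = ~hp, y = ~mpg) %>%
  add_lines() %>%
  layout(
    title = "Line Plot: MPG Across Increasing Horsepower",
    xaxis = list(title = "Horsepower"),
    yaxis = list(title = "Miles per Gallon")
  )

p_lines
```

Either approach should give the same picture; sorting yourself just makes the ordering explicit.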

3D plot

plot_ly(
  data = mtcars, x = ~wt, y = ~mpg, z = ~hp, color = ~factor(cyl), type = "scatter3d", mode = "markers") %>%
  layout(
    title = "3D Scatter Plot: Weight, MPG, and Horsepower", scene = list( xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"), zaxis = list(title = "Horsepower")),
    legend = list(title = list(text = "Cylinders")))

Reflection

Answer here:

Section 2: Making the graph interactive

Returning to the original ggplot scatterplot, there are two ways to convert it into an interactive Plotly visualization.

Approach 1. Reuse the ggplot code and let plotly handle the conversion. The plot is built with the same ggplot calls as before, and passing it to ggplotly() produces an interactive version. If you hover over a point, you can see its x- and y-axis values.

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)",
    y = "Miles per gallon",
    title = "Fuel efficiency by weight",
    color = "Cylinders"
  ) +
  theme_minimal()

# Convert the ggplot object into an interactive plotly figure
ggplotly(p)
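If the default hover box shows more than you want, `ggplotly()` has a `tooltip` argument that limits which aesthetics appear. The following is a minimal sketch that rebuilds the same plot so it can run on its own; `p_interactive` is just an illustrative name.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)",
    y = "Miles per gallon",
    title = "Fuel efficiency by weight",
    color = "Cylinders"
  ) +
  theme_minimal()

# Show only the x and y aesthetics on hover; the default is tooltip = "all".
p_interactive <- ggplotly(p, tooltip = c("x", "y"))
p_interactive
```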

Approach 2. This approach uses Plotly syntax from the start, giving more control over features like hover labels, legends, colors, and marker styling.

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  color = ~factor(cyl),
  colors = "Set1",
  type = "scatter",
  mode = "markers",
  marker = list(size = 10)
) %>%
  layout(
    title = "Fuel efficiency by weight",
    xaxis = list(title = "Weight (X thousand lbs)"),
    yaxis = list(title = "Miles per gallon"),
    legend = list(title = list(text = "Cylinders"))
  )

This activity only introduces the basics of Plotly, but the package offers many additional tools and visualization types. We encourage you to explore further and try out different interactive features and plot options.

Key Takeaways:

    • Plotly uses concepts similar to ggplot, but with different syntax.

Adding Hover Labels, Color, and Customization

So far, we’ve used Plotly to make basic interactive plots where you can zoom and hover. But the default hover labels and colors aren’t always the most helpful.

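Plotly lets you customize what the hover label shows, which variable and palette control the colors, and how the markers look (size, opacity, symbol).

This combination makes your graph feel more like a little “data app”: each hover tells a small story about that specific car.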


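Here’s a customized scatterplot using mtcars. We’ll color the points by cylinder count with the "Dark2" palette, enlarge the markers and make them slightly transparent, and replace the default hover label with the model name, weight, MPG, horsepower, and gear count.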

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Dark2",
  marker = list(size = 12, opacity = 0.8),
  text = ~paste(
    "Model:", rownames(mtcars),
    "<br>Weight:", wt,
    "<br>MPG:", mpg,
    "<br>Horsepower:", hp,
    "<br>Gears:", gear
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Fuel Efficiency with Custom Hover Labels",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon"),
    legend = list(title = list(text = "Cylinders"))
  )

When you hover over points now, you don’t just see numbers — you get a mini “profile” of each car.
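The example above builds the whole label with `paste()` and `hoverinfo = "text"`. Another option plotly offers is the `hovertemplate` attribute, where `%{x}`, `%{y}`, and `%{text}` are placeholders filled in for each point. The sketch below is one possible version, not the only way to do it; `p_hover` is an illustrative name.

```r
library(plotly)

p_hover <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Dark2",
  text = ~rownames(mtcars),   # model name, referenced below as %{text}
  hovertemplate = paste(
    "<b>%{text}</b>",
    "<br>Weight: %{x}",
    "<br>MPG: %{y:.1f}",      # d3-style format: one decimal place
    "<extra></extra>"         # hides the secondary box with the trace name
  )
)

p_hover
```

A template keeps the number formatting in one place, while the `paste()` approach makes it easier to include variables that are not mapped to an axis.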


Exercise

Goal: Practice customizing hover labels and styling so the plot tells a clearer story.

  1. Start from the example above (you can copy/paste it).

  2. Make the following three changes:

    • Add at least one different variable to the hover text (e.g., qsec or carb).
    • Change either the color palette or which variable is used for color.
    • Change at least one marker property: size, opacity, or symbol.

Use this chunk as your starting point:

# Exercise: Customize hover labels and styling

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  # TODO: choose your color mapping and palette
  color = ~factor(cyl),
  colors = "Set1",
  marker = list(
    size = 10,      # you can change this
    opacity = 0.9   # and this
  ),
  text = ~paste(
    # TODO: customize your hover text
    "Model:", rownames(mtcars),
    "<br>Weight:", wt,
    "<br>MPG:", mpg
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Your Customized Interactive Plot",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon")
  )

Short reflection (answer in text under the chunk):


Filter Function & Zooming

Interactivity isn’t just about pretty hover labels — it’s also about controlling which data you see and how closely you look at it.

There are two main ideas here:

  1. Filtering the data before plotting. Using dplyr::filter(), you can focus on a subset of the data (e.g., only cars with high MPG, or only 4-cylinder cars). This makes your interactive plot more targeted.

  2. Zooming and panning inside Plotly. Once the plot is rendered, you can:

    • Click and drag to zoom into a region.
    • Use toolbar buttons to pan around, zoom in/out, or reset.
    • This helps you explore patterns that are hard to see when everything is crammed together.

Combining filtering + zooming lets you move between “big picture” and “close-up” views of your dataset.


Example A: Filter to only 4-cylinder cars

mtcars_4cyl <- mtcars %>%
  filter(cyl == 4)

plot_ly(
  data = mtcars_4cyl,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  marker = list(size = 10),
  color = ~factor(gear)
) %>%
  layout(
    title = "Filtered Plot: Only 4-Cylinder Cars",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon"),
    legend = list(title = list(text = "Gears"))
  )
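If you want custom hover labels on a filtered plot, note that `rownames(mtcars)` always refers to all 32 cars, so it may not line up with a filtered data frame. One way around this, sketched below, is to store the model name in a column before filtering. The column name `model` and the object names are assumptions made for this example.

```r
library(plotly)
library(dplyr)

# Keep the model name as a regular column so it travels with each row.
cars_4cyl <- mtcars %>%
  mutate(model = rownames(mtcars)) %>%
  filter(cyl == 4)

p_4cyl <- plot_ly(
  data = cars_4cyl,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  marker = list(size = 10),
  text = ~paste("Model:", model, "<br>Gears:", gear),
  hoverinfo = "text"
)

p_4cyl
```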

Example B: Zooming on an unfiltered plot

plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    title = "Try Zooming and Panning (Drag to zoom, double-click to reset)",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon")
  )

Try: click-and-drag to zoom into a cluster of points, then double-click to reset.
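Zooming can also be set up in code. As a rough sketch, `layout()` accepts a `range` for each axis to choose the starting view, and `config(scrollZoom = TRUE)` turns on zooming with the mouse wheel. The specific ranges below are arbitrary values picked for illustration.

```r
library(plotly)

p_zoom <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    title = "Starting Zoomed In",
    # Initial view: only part of the data is visible until you zoom out.
    xaxis = list(title = "Weight", range = c(2, 4)),
    yaxis = list(title = "Miles per Gallon", range = c(15, 30))
  ) %>%
  config(scrollZoom = TRUE)   # allow zooming with the scroll wheel

p_zoom
```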


Exercise

Goal: Use both filtering and zooming to explore a subset of cars more deeply.

  1. Use dplyr::filter() to pick one subset of interest, such as:

    • Cars with mpg > 25
    • Cars with hp > 150
    • Only 6-cylinder cars (cyl == 6)
    • Only manual transmission cars (am == 1)
  2. Make an interactive scatterplot of wt vs mpg for that subset.

  3. Interact with it:

    • Zoom into a region.
    • Hover over several points.
  4. Write 1–2 sentences about something you saw that you might not have noticed in the full dataset.

Scaffolded code:

# Exercise: Filter + zoom exploration

# 1. Filter the dataset (change this line to your own filter condition)
mtcars_subset <- mtcars %>%
  filter(mpg > 25)   # <-- edit this condition

# 2. Make an interactive scatterplot
plot_ly(
  data = mtcars_subset,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  marker = list(size = 10)
) %>%
  layout(
    title = "Filtered & Zoomable Plot",
    xaxis = list(title = "Weight"),
    yaxis = list(title = "Miles per Gallon")
  )
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
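The warning above appears to come from the default color palette being asked for fewer than three colors, which can happen when the filtered data contains only one or two groups. It is a warning rather than an error, so the plot should still render. A quick check of how many groups survive the filter, as in this sketch, can explain it:

```r
library(dplyr)

mtcars_subset <- mtcars %>%
  filter(mpg > 25)

# How many cars are left, and how many distinct cylinder values?
nrow(mtcars_subset)
n_distinct(mtcars_subset$cyl)
```

With the `mpg > 25` filter, 6 cars remain and all of them have 4 cylinders, so there is only a single color group.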

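Reflection (answer in text under the chunk): In 1–2 sentences, describe something you saw in your filtered plot that you might not have noticed in the full dataset.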


AI usage: AI assistance was used to improve grammar, clarity, and spelling.